Abstract

We consider any network environment in which the "best shot 
game" is played. This is the case where the possible actions are only 
two for every node (0 and 1), and the best response for a node is 1 if 
and only if all her neighbors play 0. A natural application of the model 
is one in which the action 1 is the purchase of a good, which is locally 
a public good, in the sense that it will be available also to neighbors. 
This game typically exhibits a great multiplicity of equilibria. Imagine 
a social planner whose scope is to find an optimal equilibrium, i.e. one 
in which the number of nodes playing 1 is minimal. Finding such an
equilibrium is a very hard task for any non-trivial network architecture.
We propose an implementable mechanism that, in the limit of
infinite time, reaches an optimal equilibrium, even if this equilibrium
and even the network structure are unknown to the social planner.
1 Introduction



Take an exogenous network in which otherwise homogeneous players (nodes)
play a public good game, namely the one called the Best shot game in Galeotti
et al. (2010).¹ The best shot game is a discrete case, with restricted strategy
profiles and satiated utilities, of the model in Bramoulle and Kranton (2007)
and of the second stage of the game in Galeotti and Goyal (2008). The action
of each node i is an effort $x_i$ and her payoff depends on the aggregate effort
of herself and that of her neighbors, minus some cost for her own effort.

Here we restrict strategy profiles to the two specialized actions: $x_i \in
\{0, 1\}$.² In this way $x$, a vector of specialized actions whose length is given
by the number of nodes, will characterize any possible configuration of the
system. We will consider the class of incentives such that, in Nash equilibrium
(NE), agent i will play action $x_i$ according to the following rule:

$$x_i = \begin{cases} 1 & \text{if } x_j = 0 \text{ for all neighbors } j \text{ of node } i, \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$

We will study all the NE of the game: that is all those action profiles in which, 
for any link, not both nodes of the link put in effort 1; but at the same time 
for any node, if we consider the set including itself and its neighborhood, 
at least one node in this set puts in effort 1. Mathematically, the subset 
of nodes playing 1 in a NE will then be a maximal independent set of the 
network, as it is called in graph theory. 
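
The equivalence between rule (1) and maximal independent sets can be checked directly on a small network. The following is a minimal sketch, not part of the original analysis: the dictionary-of-neighbor-sets representation and the function names are assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical representation: each node maps to the set of its neighbors.
Graph = Dict[int, Set[int]]


def is_nash_equilibrium(graph: Graph, x: Dict[int, int]) -> bool:
    """Return True if every node's action satisfies rule (1)."""
    for i, neighbors in graph.items():
        all_neighbors_play_zero = all(x[j] == 0 for j in neighbors)
        best_response = 1 if all_neighbors_play_zero else 0
        if x[i] != best_response:
            return False
    return True


def is_maximal_independent_set(graph: Graph, ones: Set[int]) -> bool:
    """Check independence (no link joins two members) and maximality
    (every non-member has at least one member among its neighbors)."""
    independent = all(not (graph[i] & ones) for i in ones)
    dominating = all(i in ones or bool(graph[i] & ones) for i in graph)
    return independent and dominating


# A path of four nodes: 0 - 1 - 2 - 3 (illustrative example only).
path: Graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
profile = {0: 1, 1: 0, 2: 1, 3: 0}
players_of_one = {i for i, action in profile.items() if action == 1}
```

On this path, a profile passes `is_nash_equilibrium` exactly when its set of 1-players passes `is_maximal_independent_set`, which is the correspondence stated above.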

The next example will give some insight on the maximal independent 
sets, our NE, for simple networks. 

Example 1 A network of 9 nodes. 

Figure 1 shows four possible NE for the same network of 9 nodes. Black
nodes are those playing 1, while all the others are playing 0. The bottom- 
right NE is the only one in which only three nodes play action 1. If we assume 
action 1 to be a costly action, interpreting it as the purchase of a local public 
good, then the bottom-right NE is socially optimal, at least regarding costs. 
□ 

1 Galeotti et al. (2010) give this name in Example 2 and use it throughout the paper.
The name Best shot game comes from Hirshleifer (1983), where it is however described
as a non-network game.

2 One result in Bramoulle and Kranton (2007) is actually that, even when the possible
actions of nodes are continuous, in a stable equilibrium every agent would play either 0 or
a fixed value $e^* > 0$, which can be normalized to 1.
By considering this last example, a first intuition is that when more connected
nodes play 1, then the number of 1-players in equilibrium is reduced.
The extreme case of this will happen on a star-shaped network, as shown
in the next example.

Example 2 The star. 

It is easy to see that the star has only two maximal independent sets (see
Figure 2): one in which the center alone plays 1, and another one in which
the spokes do so. If we are looking for efficiency (defined as fewer 1s, which
are supposed to be costly) it is very easy to find that the first case is the best
one. Suppose that we are in the bad NE (spokes exerting the costly effort),
then a social planner could shift to the good equilibrium by incentivizing a
contribution from the center. When the center is contributing, then, by best
response, all spokes stop doing so. This mechanism will be formalized in the
next section, but the idea is that of incentivizing a contribution from agents
that were not contributing in a NE, so that the system will move to a new NE,
which may reduce the social cost of being in equilibrium. □

Figure 2: The two NE of a star network.

The problem of finding all maximal independent sets of a general network
is however not an easy one and will be discussed in Section 3. This problem is
actually NP-hard,³ as is the problem of finding those maximal independent
sets with the most or the fewest nodes playing 1. In a companion paper, Dall'Asta,
Pin and Ramezanpour (2009), we discuss these aspects in more detail for a
particular class of random networks. The next example may give a hint of
this.

Example 3 A regular random network. 

Consider the regular random network illustrated (twice) in Figure 3. It
has 20 nodes, and each of them has exactly 4 links. In this case we cannot
propose any strategy that targets as contributors those nodes with many
links, as the previous examples might suggest. This network in particular
has 128 equilibria: 2 (one is in Figure 3, left) with 4 nodes contributing,
25 with 5, 58 with 6, 42 with 7, and only 1 (Figure 3, right) with 8 nodes
contributing. In Dall'Asta, Pin and Ramezanpour (2009) we consider such
networks consisting of a large number of nodes, and we use an analytic
approach to compute the approximate number of NE as a function of the
fraction of contributors.⁴ The predictions are very accurate when the number
of nodes is large, but search algorithms are unable to successfully explore
in finite time the large deviations predicted by the theory (this problem is
also NP-hard). For small networks, even if regular random, there is a lot
of variability. Other networks of 20 nodes and degree 4, generated with the
same random process, have completely different distributions. The only way
to find all the equilibria in a particular network is to check all the
$2^{20} \approx 10^6$ possible pure strategy profiles. □

3 An optimization problem is NP-hard if it is at least as difficult as any problem in the
NP-complete class (NP stands for non-deterministic polynomial). Consider a general problem
whose object (input) is characterized by a certain size N (as could be the number of nodes
in our case). Here is a non-rigorous definition: a problem is called NP-complete if no
algorithm is known that can find a solution to the problem, for any possible input of size N,
in a time that grows at most polynomially in N. An NP-complete problem is one in which the
time required to find a solution typically grows exponentially in N. In practice this means
that, even if a good computer can solve the problem in a reasonable time for N = 1000,
the case N = 10000 may take years to be solved.

4 By adopting a mean field analysis, Lopez-Pintado (2008) identifies the mean fraction
of contributors for a typical NE.




Figure 3: Two NE for the same regular random network of 20 nodes
and degree 4. Picture is obtained by means of the software Pajek
(http://pajek.imfm.si/).
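
The exhaustive check over all pure strategy profiles mentioned in Example 3 can be sketched as follows. This is only an illustration under stated assumptions: nodes are assumed to be labeled 0..N-1, the function names are hypothetical, and a 6-node ring is used instead of the 20-node networks of the paper to keep the run instantaneous.

```python
from collections import Counter
from itertools import product
from typing import Dict, List, Set, Tuple

# Hypothetical representation: node -> set of neighbors, nodes labeled 0..N-1.
Graph = Dict[int, Set[int]]


def is_nash_equilibrium(graph: Graph, x: Tuple[int, ...]) -> bool:
    """Rule (1): a node plays 1 if and only if all its neighbors play 0."""
    return all(
        x[i] == (1 if all(x[j] == 0 for j in graph[i]) else 0)
        for i in graph
    )


def enumerate_equilibria(graph: Graph) -> List[Tuple[int, ...]]:
    """Check each of the 2^N pure strategy profiles (feasible for N = 20)."""
    n = len(graph)
    return [x for x in product((0, 1), repeat=n) if is_nash_equilibrium(graph, x)]


def count_by_contributors(equilibria: List[Tuple[int, ...]]) -> Dict[int, int]:
    """Histogram: number of contributors -> number of equilibria."""
    return dict(Counter(sum(x) for x in equilibria))


# Illustrative network: a ring of 6 nodes, each linked to its two neighbors.
ring: Graph = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
equilibria = enumerate_equilibria(ring)
histogram = count_by_contributors(equilibria)
```

For the ring, `histogram` groups the equilibria by number of contributors in the same way as the counts reported in Example 3; the cost of the enumeration doubles with every additional node.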



From the point of view of economics, the rule specified in (1) is not behavioral
and could be justified by several modelling choices with rational agents.
Up to now we have defined (pure) Nash equilibria without explicitly defining
actions and payoffs; this however could easily be done. One possibility is
the following. Any agent attributes utility v to a homogeneous good, if she
has access to it (independently of whether it is provided by herself or by
any of her neighbors), and her utility is satiated by one unit of it. Finally,
the cost of providing the good is a positive value c < v. Since utilities are
satiated, and in equilibrium every agent has local access to the good, then
considering efficiency from the point of view of minimal aggregated costs is
enough to achieve global efficiency. In our model agents consider only local
spillovers and exclude any externality from any other non-neighbor player.
In this sense the network structure formalizes the range of the externalities. 
Note however that, because of satiation, the utility of agents is not linear 
in the contribution effort of neighbors, so that our model is not included in 
the class of games analyzed by Ballester et al. (2006), hence it cannot be 
solved with the help of Bonacich centrality. Bramoulle and Kranton (2007) 
consider non-satiated utility functions and find the typical public-good discrepancy
between efficient strategy profiles and equilibria. A general class of
games that includes the one from Bramoulle and Kranton (2007) is analyzed
in Bramoulle et al. (2009); however, their class also does not include our
satiated utilities.

In the next section we will formally define the general best reply mechanism
that we consider, and that we also implement in the numerical simulations.
From a theoretical point of view, it may seem that we exclude full rationality
when we assume that agents respond to changes with a best response rule
that considers only the present configuration but is myopic and not strategic
about possible future changes. Consider, however, that another explanation
for agents not being interested in future expected payoffs is a high rate of
temporal discount.

The kind of situation we have in mind is that of every agent deciding
whether or not to exert a fixed costly effort that is beneficial to herself and
also to her neighbors, so that a typical situation of free riding incentives
arises. This could be the case with farmers or firms adopting new technologies,
with an information network and a cost for possible failures.⁵ Another
application could be that of several municipalities in a given region; the
public good could be a library or a fire brigade, and two municipalities are
linked if the public good in one of them makes the same public good undesirable
in the other one because of geographical proximity. Finally, since
the mechanism we propose requires low costs of shifting between strategies
and repeated interaction, a good application could be that of a big firm encouraging
people to share cars in order to minimize parking places. Action
1 would mean 'take the car' and an employee would play 0 if a friend gives
her a lift. Generally, in any of these applications there could be a planner
whose objective could reasonably be that of minimizing costs.

Suppose that the planner considers all possible NE of the game (all maximal
independent sets of the network) and wants to minimize among them
the number of nodes exerting effort 1 (i.e. find a maximal independent set
of minimal cardinality: MNE). She could impose the proper action on the
agents, and the resulting configuration, being a NE, would be stable without
further incentives. Suppose, however, that the planner does not
know such an optimal distribution (remember that the theoretical problem
is typically a complex one) or that she may not even know anything
about the network. Assuming that we also have a time dimension, our
question is: would it still be possible for the planner to build a mechanism
that would incentivize the agents to move towards a MNE?⁶ Our
answer is only theoretical but positive: in the limit of infinite time such a
mechanism exists, and it will lead to a MNE with probability 1.

5 This is the application proposed in Bramoulle and Kranton (2007), where they cite
the applied model in Foster and Rosenzweig (1995).

6 We will use the term mechanism to differentiate it from algorithm. While the latter
is intended as a computational technique, the former is a plausible implementation of
any single step of such a technique into a real system, also allowing the interaction of
self-interested agents.

What we assume is that the social planner's goal is to minimize the costs
of a NE, when she has the possibility of incentivizing players' actions out of
equilibrium, but she is not able to modify the structure of the network. It is
clear that if the planner had the possibility of changing the network structure,
directly or by incentives, at a reasonable cost (as is the case considered on
a different network game by Haag and Lagunoff (2006)) then the problem
would look very different. It would be enough to approximate a star-like
configuration such as the one analyzed in Example 2 and the solution would
easily be found.

In the next section we show how we obtain our result. We show that our
setup is included in the hypothesis of a theorem first proved in Geman and
Geman (1984) and presented here in Appendix A. The proof of this equivalence
is based on three lemmas, whose proofs are in Appendix B. Section
3 analyzes, mainly by means of numerical simulations, how the simulated
annealing approach that we propose performs in two very different network
structures: regular random networks and scale free networks. We conclude
the paper with Section 4.

2 Main result

The mechanism we study is defined in discrete time (t = 1,2,3,...). At
every time step the configuration $x_t$ of nodes' actions satisfies condition (1)
for every node, and hence is a NE. Suppose then that at time 1 the system is
in a NE, so that $x_{i,1} \in \{0, 1\}$ is a best response for every agent i, as specified
in (1). The planner does not know anything about the network; the only
thing she observes at any step t in time is the action of each player and
hence the aggregate number $M_t = \sum_i x_{i,t}$ of agents playing 1. At every time
step, she picks an agent $i_t$ playing 0, at random with uniform probabilities,
and induces her to flip her strategy to 1.⁷ Let us call this transition F.
The transition F is defined only from a NE $x$ to a non-NE $x'$. It defines a
Markov chain across all {0, 1} vectors x. As a consequence of this flip, all the
other nodes in the network will change their strategy according to the best
response rule defined below.

7 This can easily be done through incentives. The reason why the planner is looking for
a minimum could be that she is financing all the agents exerting effort; in this case she
could raise her contribution to the agent up to the desired threshold level.

Consider the subset of unsatisfied agents in a non-NE configuration, i.e.
any agent for which condition (1) is violated, either because she plays 0
and also all her neighbors do, or because she plays 1 and at least one of
her neighbors does the same. If we apply transition F to a node $i_t$ who was
originally playing 0, then the set of unsatisfied nodes always includes elements
different from $i_t$, as in a NE there is always at least one node j playing 1
around any node $i_t$ playing 0. The basic step of the best response rule is
iterated by picking with uniform probabilities one of the unsatisfied nodes,
different from $i_t$, and flipping her strategy. Let us call this transition step B.
This basic step B clearly defines a Markov chain across all {0,1} vectors x,
whose absorbing states are NE. In Proposition 4 we show that if we start from
a NE, we apply F once, and then we iterate B, we reach with probability 1,
and with a limited number of steps, a new NE. We show also that, for the
scope of this result, we can discard without loss of generality the possibility
of synchronous updating. It is clear that, in the assumptions of the model,
F is induced by the planner, while the iteration of B is obtained from the
endogenous adaptation of the agents, as long as they are not all satisfied.
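
One application of F followed by the iteration of B can be sketched as below. This is a minimal illustration rather than the authors' implementation: the graph representation, the function names and the use of `random.Random` for the uniform draws are assumptions.

```python
import random
from typing import Dict, List, Set

# Hypothetical representation: each node maps to the set of its neighbors.
Graph = Dict[int, Set[int]]


def unsatisfied(graph: Graph, x: Dict[int, int], exclude: int) -> List[int]:
    """Nodes other than `exclude` whose action violates rule (1)."""
    result = []
    for i in graph:
        if i == exclude:
            continue
        best_response = 1 if all(x[j] == 0 for j in graph[i]) else 0
        if x[i] != best_response:
            result.append(i)
    return result


def perturb_and_relax(graph: Graph, x: Dict[int, int], flipped: int,
                      rng: random.Random) -> Dict[int, int]:
    """Apply F to `flipped` (who must play 0 in the NE x), then iterate B."""
    if x[flipped] != 0:
        raise ValueError("transition F is defined only for a node playing 0")
    y = dict(x)
    y[flipped] = 1  # transition F, induced by the planner
    while True:
        pending = unsatisfied(graph, y, exclude=flipped)
        if not pending:
            return y  # no other node wants to deviate
        j = rng.choice(pending)  # uniform draw among unsatisfied nodes
        y[j] = 1 - y[j]  # transition step B, endogenous best response


# The star of Example 2, starting from the NE in which the spokes contribute.
star: Graph = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
spokes_contribute = {0: 0, 1: 1, 2: 1, 3: 1, 4: 1}
after_flip = perturb_and_relax(star, spokes_contribute, flipped=0,
                               rng=random.Random(0))
```

On the star, flipping the center makes every spoke unsatisfied, and each of them stops contributing, which reproduces the shift described in Example 2.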

When the system is stable again, i.e. again in a new NE, the planner will
observe a new configuration $x_t^{new}$ and the new aggregate quantity of 1's, call
it $M_t^{new}$. The planner will accept the new configuration with probability

$$\begin{cases} 1 & \text{if } M_t^{new} < M_t \\ t^{-\epsilon (M_t^{new} - M_t)} & \text{otherwise} \end{cases} \qquad (2)$$

where $\epsilon > 0$ is a constant. The second probability in (2) is the probability
of accepting a non-improving change; it declines over time, so that such
changes are rejected more and more often.

We start by proving that $x_t^{new}$ is always a NE for any t (see Lemma 1
below). If the planner accepts the new configuration, then $x_{t+1} = x_t^{new}$ and
$M_{t+1} = M_t^{new}$; otherwise she will impose reverse incentives so that we return
to the original configuration,⁸ i.e. $x_{t+1} = x_t$ and $M_{t+1} = M_t$.

In the limit $t \to \infty$, the second probability in (2) goes to 0 and the
mechanism will converge to a member of a precise subset of NE. Call
the subset of such possible NE local minima.⁹ Every MNE is also a local
minimum. The question is whether the local minimum in which the process
ends is also a MNE. The aim of this paper is to show under which conditions
the answer is positive.

8 This can be done by reverting all incentives to the nodes who changed; they are, by
Lemma 2, restricted to a local neighborhood.

9 It is also possible that the mechanism, in the limit $t \to \infty$, alternates between more
than one NE, if all of them have the same number of 1's. Without loss of generality,
such subsets of NE can simply be included among local minima.

The structure of the proof is the following. We show that we meet the
conditions required for the application of a known theorem.

Lemma 1 If we start from a NE and invert the action of one node from
0 to 1, then the best response rule of all the other nodes in the network will
imply a new NE.

Lemma 2 If we start from a NE and invert the action of one node from
0 to 1, then the best response rule of all the other nodes in the network will be
limited to the neighborhood of order 2 of the original node (i.e. the change is
only local).

Lemma 3 It is possible to reach any NE from any other NE with a finite
number of the following procedures: flip the action of a single node from 0 to
1 (transition F) and obtain, by iterated best response of the nodes (transition
step B), a new NE.

Proposition 4 The probability $\pi(\epsilon)$ that the mechanism ends in a MNE, in
the limit $t \to \infty$, is strictly positive for any $\epsilon > 0$; it is decreasing in $\epsilon$; and
finally, there exists an $\bar\epsilon > 0$ such that, for any $\epsilon < \bar\epsilon$, we have that $\pi(\epsilon) = 1$
independently of the initial conditions.

Proof: consider the set $\Omega$ of NE of a given finite network, which is a
subset of all the {0,1} vectors x, and call $N_\Omega = |\Omega|$ its finite cardinality.
Call $|x|$ the number of agents playing 1 in an equilibrium $x \in \Omega$, and define
$U^* = \max\{|x| : x \in \Omega\}$, $U_* = \min\{|x| : x \in \Omega\}$, and $\Delta = U^* - U_*$.
If we first apply F to any $x \in \Omega$ and then iterate B, then by the proofs of
Lemmas 1 and 2 in a finite number of iterations we reach a new NE $x' \in \Omega$,
with $x' \neq x$. This defines a stochastic process X between the states of $\Omega$
which is ergodic because of Lemma 3.

Then, we are in the conditions of Theorem B in Geman and Geman (1984)
(see Appendix A), which also fixes the value of $\bar\epsilon$. □

The lemmas are proven in Appendix B by applying the discrete mathematics
of network theory. Lemmas 1 and 2 also guarantee that the proposed
mechanism is well defined.

The main proposition is obtained by including our setup in the general
hypothesis of the theory of simulated annealing, first proposed and formalized
in Kirkpatrick, Gelatt and Vecchi (1983). Simulated annealing is a heuristic
algorithm based essentially on a rejection probability, in a Monte
Carlo step, that increases over time; in our case non-improving changes are
accepted with the probability $t^{-\epsilon (M_t^{new} - M_t)}$ in (2), which decreases in t. Simulated
annealing works exactly as described above, finding a global minimum of a
certain function, avoiding local minima. Theory tells us that, if the number
of possible configurations is finite, and it is possible to reach any configuration
from any other with basic steps, then a generalization of the above
proposition holds. The rigorous proof that applies to our model can be found
in Theorem B of Geman and Geman (1984), which we discuss in Appendix
A. The original proof takes several pages; its intuition is that we are analyzing
a Markov chain over a finite number of possible configurations (all the NE of the game)
which is ergodic for any finite t.

In our case, we consider all the NE as the possible states of the system;
they are finite because the network is finite. Lemmas 1 and 2 define a stochastic
process between the states of the system, and this process is ergodic by
Lemma 3. We thus meet the conditions that apply in Appendix A.
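
The accept/reject step with its cooling schedule $t^{-\epsilon \Delta M}$ can be sketched generically as below. This is an illustration under assumptions: `propose` and `cost` are hypothetical callbacks (in the mechanism they would stand for "apply F, then iterate B" and "count the agents playing 1"), and the finite horizon `steps` replaces the infinite-time limit of the theory.

```python
import random
from typing import Callable, TypeVar

State = TypeVar("State")


def acceptance_probability(t: int, m_new: int, m_old: int, eps: float) -> float:
    """Probability of accepting the new configuration at time t >= 1."""
    if m_new < m_old:
        return 1.0  # improvements are always accepted
    return t ** (-eps * (m_new - m_old))  # non-improving changes


def anneal(x0: State,
           propose: Callable[[State, random.Random], State],
           cost: Callable[[State], int],
           eps: float,
           steps: int,
           rng: random.Random) -> State:
    """Run the accept/reject loop for a finite number of steps."""
    x = x0
    for t in range(1, steps + 1):
        candidate = propose(x, rng)  # hypothetical: F followed by iterated B
        p = acceptance_probability(t, cost(candidate), cost(x), eps)
        if rng.random() < p:
            x = candidate  # accept the new configuration
        # otherwise the planner reverts to the previous configuration
    return x


# Toy usage (assumption: integer states whose cost is their own value).
final_state = anneal(5, lambda s, rng: max(s - 1, 0), lambda s: s,
                     eps=0.5, steps=10, rng=random.Random(0))
```

A smaller `eps` keeps the acceptance of non-improving changes high for longer, which is the trade-off between accuracy and speed discussed in the next section.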



3 Accuracy vs. speed of convergence 

The mechanism that we propose reaches an optimal outcome with probability
1 but is extremely time consuming. In this section we discuss how in some
cases the choice of a faster mechanism (i.e. a higher $\epsilon$) could be useful if
we are looking for almost optimal solutions in shorter time. However, the
trade-off between accuracy and speed of convergence is very hard to compute
in general. Simple adaptations of the mechanism may not be useful at all in
some cases, as we show below by means of computer simulations.

We run simulations on random regular networks, such as the one in Example 3,
and on scale-free networks, such as the one illustrated in the following example.¹⁰

Example 4 A scale-free network. 

Consider the random scale-free network illustrated (twice) in Figure 4.
It has been generated with the simple algorithm proposed in Albert and
Barabasi (1999): it has 20 nodes, and they have an average degree of exactly
4 links. This network in particular has 48 equilibria: 2 (one is in Figure
4, left) with 4 nodes contributing, 2 with 5, 6 with 6, 5 with 7, 6 with
8, 9 with 9, 13 with 10, 4 with 11, and only 1 (Figure 4, right) with 12
nodes contributing. Other such networks of 20 nodes and average degree 4,
generated with the same algorithm, have completely different distributions.
As there is heterogeneity in the distribution of links, a good strategy to find
efficient equilibria could be that of targeting as contributors those nodes with
many links. However, in this way we may not find the good equilibrium with
only 4 contributors, where a node with 10 links is not contributing, while
two of the four contributors have only 3 links. Also in this class of random
networks we are able to find and compare all the equilibria only by checking
all the $2^{20} \approx 10^6$ possible pure strategy profiles. □

10 It is well known that random regular and scale-free networks do not have some of
the properties, such as clustering or assortativity, that real world networks have (see Newman
(2003) and Jackson and Rogers (2007) for more discussion), and that other models would
be more realistic in generating large random networks. However, as we are working with
small networks of 20 nodes, the two models that we are using provide the necessary distinction
between a homogeneous and a heterogeneous distribution of links, and differentiations
on other dimensions are irrelevant.



Figure 4: Two NE for the same scale-free network of 20 nodes and average
degree 4. Picture is obtained by means of the software Pajek
(http://pajek.imfm.si/).



In the simulations we do the following. First we generate a random network
with one of the two models; then we count all the $N_\Omega$ equilibria of that
network out of all the pure strategy profiles, and from this information we
can easily obtain also $\Delta = U^* - U_*$, for that particular network. Then we
compute the Markov matrix induced, on the set $\Omega$ of NE, by the application
of F and the iteration of B to any element of that set. From this we get the
information about which of the NE of that network are also local minima
of the Markov process, and also about which ones are local minima but
not MNE. Finally, we run simulated annealing on that matrix with different
values of $\epsilon > \bar\epsilon$.

In principle, as we obtain the Markov matrix, we could apply theoretical
results (see e.g. the lectures of Catoni (1999)) to approximate the accuracy
and the speed of convergence that any $\epsilon$ would give. The problem is that any
single different network, obtained with the same model, may have a completely
different number and distribution of the NE. The theoretical results
could be applied only for a particular network or for very specific and completely
symmetric classes of networks, for which the problem of finding the
MNE is however a trivial one to solve.

We run the simulation described above with 50 random regular networks
of 20 nodes and degree 4, and with 50 scale-free networks of 20 nodes and
average degree 4. We use a log-grid of $\epsilon$'s that are multiples of $\bar\epsilon$. The
multiplying factor ranges from 10 to 1000. For any run of the simulated
annealing we report the time needed for convergence.¹¹ We also report a measure
of accuracy of the resulting NE. This measure is normalized to be 0 if the
number of nodes contributing is $U_*$ and 1 if the contributors are $U^*$; more
precisely, if the number of contributors is n the accuracy is $(n - U_*)/(U^* - U_*)$. Note that
we know the values of $U^*$ and $U_*$ for each of the networks that we analyze only
because we are in a completely controlled environment. Finally, we report
how all the NE are distributed in the 50 networks, by number of contributors,
and how many of them are local minima of the Markov process but not MNE.
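
The normalized accuracy measure can be written as a one-line function. The sketch below is illustrative only; the function name is hypothetical, and returning 0 when all NE have the same number of contributors ($U^* = U_*$) is an assumption not discussed in the text.

```python
def accuracy(n: int, u_min: int, u_max: int) -> float:
    """Normalized accuracy (n - U_*) / (U^* - U_*): 0 is optimal, 1 is worst."""
    if not u_min <= n <= u_max:
        raise ValueError("n must lie between the minimal and maximal NE sizes")
    if u_max == u_min:
        return 0.0  # assumption: every NE is a MNE when U^* equals U_*
    return (n - u_min) / (u_max - u_min)


# Values from Example 3 (U_* = 4, U^* = 8) and Example 4 (U_* = 4, U^* = 12).
regular_case = accuracy(5, 4, 8)
scale_free_case = accuracy(10, 4, 12)
```

With the figures of Example 3, a NE with 5 contributors is one quarter of the way from the best to the worst equilibrium; with those of Example 4, a NE with 10 contributors is three quarters of the way.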

Results are shown in Figures 5 and 6. The number of time-steps needed
for convergence on a single realization of simulated annealing on each of the
50 different networks is shown as box-plots in the left panels: the thick
lines represent the log-median of the realizations, the edges of the rectangles
are the first and third log-quartiles, whiskers cover all those observations that
would be in the 99% confidence interval (above or below) if the data were
log-normal, and crosses are outliers outside this range. The distribution of times
is almost the same in the two classes of networks.
Figure 5: Results of the simulated annealing on 50 random regular networks
of 20 nodes and degree 4, for $\epsilon$ ranging from $10^1$ to $10^3$ times $\bar\epsilon$. In the left panel
we have the box-plot of the time needed for convergence; in the center we
have the box-plot of the accuracy of the algorithm (normalized to be
optimal at 0); in the right panel we have the distribution of NE (black) in
the 50 networks and those NE (in grey) which are local minima of the Markov
process, but not MNE.



The accuracy of simulated annealing is reported in the center panels.
Both the scale and the box-plot on the y-axis are now linear. Simulated annealing
performs better on random regular networks, where, even if $\epsilon = 1000 \cdot \bar\epsilon$,
at least half of the realizations converge to the MNE. This is clearly not the
case for the scale-free networks. The reason for this is not that the scale-free
networks have more local minima. The right panels report the frequency of
all the NE, and that of non-MNE local minima (in grey), as a function of the
number of contributors. For the 50 random regular networks the variance of
contributors between NE is much smaller (the network in Example 3 is an
exception), and even if almost 28% of the NE are non-MNE local minima,
the density of contributors they have is very close to the minimal one. For
the 50 scale-free networks, NE are much more heterogeneous in the number
of contributors, and even if only 6% of them are non-MNE local minima,
they can have many more contributors than the MNE have.

11 We assume that the simulated annealing algorithm that we run converges when it
does not change for $10^4$ steps, and a threshold is set at $10^7$ steps.

Figure 6: Results of the simulated annealing on 50 random scale free networks
of 20 nodes and average degree 4, for $\epsilon$ ranging from $10^1$ to $10^3$ times $\bar\epsilon$. The three
plots have the same legend as in Figure 5.

The main insight from the simulations is that, for some networks, the
simulated annealing approach that we implement in our mechanism works
well and fast even for $\epsilon \gg \bar\epsilon$, while this is not the case for others. And
this distinction is not trivial. In regular random networks it is actually very
difficult to argue ex-ante who are the contributors in the MNE, because of the
full homogeneity between them. However, a fast version of the mechanism
is reasonably accurate in such networks because the Markov process induced
by the mechanism itself does not get trapped in local minima far from
the MNE. On the other hand, in scale-free networks one could approach a
NE with a low number of contributors by targeting the hubs, i.e. those nodes
with many links. This strategy will probably find only a local minimum, and
this problem arises even when we run our mechanism. A fast version of the
mechanism is not accurate in such networks because such local minima may
have many more contributors than the MNE of the network.
4 Short considerations



The problem of finding a MNE among all the NE is in general not a trivial
one, and the difference between the aggregate number of nodes playing 1
in NE could vary significantly even in homogeneous networks, as shown in
Example 3. The star structure (Example 2) is a trivial but dramatic example:
there are two NE, one in which the center alone plays 1, and another in which
all the spokes do so and the center free rides.

The main practical problem in the implementation of the mechanism we
propose is clearly the necessity of infinite time. This paper is only theoretical.
However, simulated annealing is used in practice in many optimization
problems.¹² For any $\epsilon > 0$ the system will reach a local minimum, which can
be easily identified even in finite time (the higher the $\epsilon$ the faster the convergence).
Noting that the values $\epsilon < \bar\epsilon$ are typically unrealistically low, and that
the algorithm therefore converges very slowly, the choice of a proper heuristic
$\epsilon > \bar\epsilon$ could be appropriate. This choice would depend on a profit/costs comparison
but also, in the case of finite time, on the structure of the network
(e.g. the star needs a single flip to move from the bad NE to the MNE).
As shown in Section 3, particular care should be applied because for some
networks there is a concrete risk of finding a local minimum that is very
inefficient.

Finally, even if the planner does not initially know the real structure of
the network, she could infer it link by link as the steps of the mechanism 
are played. In this way she could mix the mechanism with a theoretical 
investigation, and could target nodes non-randomly in order to maximize the 
likelihood of finding the desired MNE. The analysis of such a sophisticated 
approach would be much more complicated. What we give here is an upper 
bound that, we prove, holds exactly (even if in the limit of infinite time). 
Any improvement on this naive mechanism will work as well, faster, but not 
in finite short time for any possible network, because the original problem is 
NP-hard. 

Appendices 

A Theorem B in Geman and Geman (1984) 

Geman and Geman (1984) is a pioneering theoretical paper on computer
graphics, studying the best achievable quality of images. Sections X to XII
are devoted to the general case of optimization among a finite number of
states. We find there a general theorem (Theorem B at page 731) proving
a conjecture on the Simulated Annealing heuristic algorithm proposed by
Kirkpatrick, Gelatt and Vecchi (1983). The rising popularity of Simulated
Annealing has attested the success of Geman and Geman (1984), which is
now cited (according to scholar.google.com in January 2010) by almost 10000
papers from all disciplines.

12 Crama and Schyns (2003) is a good example related to finance.

In this appendix we summarize what is necessary for us from this result, 
with some of the original notation but avoiding most of the thermodynamics 
jargon. Suppose that there is a finite set $\Omega$ of states, and a function 
$U : \Omega \to \mathbb{R}_+$, so that, for any $\omega \in \Omega$, $U(\omega)$ 
is a positive number. Call $U^* = \max_{\omega \in \Omega} U(\omega)$ the maximal 
value of $U$, $U_* = \min_{\omega \in \Omega} U(\omega)$ its minimal value, and 
$\bar{\Omega} = \arg\min_{\omega \in \Omega} U(\omega)$ those states whose value 
is $U_*$. Suppose moreover that we have a fixed transition matrix $X$ between 
all the elements of $\Omega$ and that this stochastic matrix $X$ is ergodic, 
i.e. there is a positive probability of reaching any state $\omega' \in \Omega$ 
from any other state $\omega'' \in \Omega$. Given any $\omega \in \Omega$, call 
$X(\omega)$ all those states that can be reached from $\omega$ with positive 
probability, through $X$, with a single step. 

Consider now a discrete time flow with $t = 1, 2, \dots$ and the following new 
stochastic process. $\omega_1$ is any member of $\Omega$. Imagine that, at time 
$t$, the process is in the state $\omega_t$, then apply $X$ from $\omega_t$, 
obtaining a state that we call $\omega_t^{new}$. We now define $\omega_{t+1}$ as 

$$
\omega_{t+1} =
\begin{cases}
\omega_t^{new} & \text{with probability }
  \begin{cases}
  1 & \text{if } U(\omega_t^{new}) < U(\omega_t) \\
  t^{-\epsilon \left( U(\omega_t^{new}) - U(\omega_t) \right)} & \text{otherwise}
  \end{cases} \\
\omega_t & \text{otherwise.}
\end{cases}
\qquad (3)
$$

The probability $t^{-\epsilon (U(\omega_t^{new}) - U(\omega_t))}$ in (3) identifies 
the level of acceptance of non-improving changes, which is declining in time 
at a rate that depends on the constant $\epsilon > 0$. Any such stochastic 
process will be identified by $\omega_0$ and $\epsilon$: call it 
$P_{\omega_0,\epsilon}$. 
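The update rule (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `acceptance_probability`, `anneal_step`, `propose` and `rng` are hypothetical and do not come from Geman and Geman (1984); `propose` stands for one application of the transition matrix $X$, and `U` is any positive function on the states.

```python
import random

def acceptance_probability(delta_u, t, eps):
    """Probability of moving to the proposed state under rule (3)."""
    if delta_u < 0:
        return 1.0                    # improving moves are always accepted
    return t ** (-eps * delta_u)      # non-improving moves: t^(-eps * dU)

def anneal_step(omega, t, U, propose, eps, rng):
    """One step of the process: propose through X, then accept or stay."""
    candidate = propose(omega, rng)   # hypothetical stand-in for X
    p = acceptance_probability(U(candidate) - U(omega), t, eps)
    return candidate if rng.random() < p else omega
```

Calling `anneal_step` for t = 1, 2, ... from an initial state reproduces one realization of the process; the acceptance of non-improving moves shrinks with `t` at a rate set by `eps`.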



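It is easy to prove that in the limit $t \to \infty$ any realization of 
$P_{\omega_0,\epsilon}$ will end up in a set of local minima 
$\Omega_\epsilon \subseteq \Omega$. $\Omega_\epsilon$ is such that, for any 
$\omega', \omega'' \in \Omega_\epsilon$ and $\omega^x \in X(\omega')$, 
$U(\omega') = U(\omega'')$ and $U(\omega') \le U(\omega^x)$. 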

The theorem imposes a single condition on $\epsilon$ so that the local minima 
obtained through $P_{\omega_0,\epsilon}$ are also global minima. 

Theorem B: call $N_\Omega$ the cardinality of $\Omega$ and 
$\Delta = U^* - U_*$. If $\epsilon < \bar{\epsilon} = \frac{1}{N_\Omega \Delta}$, 
then $\Omega_\epsilon \subseteq \bar{\Omega}$ (the set of global minima of $U$) 
for any realization of $P_{\omega_0,\epsilon}$, independently of $\omega_0$. 

The proof is by no means trivial: it takes several pages and is heavily 
based on the ergodicity of the system. In Geman and Geman's notation, what 
they call temperature is $\frac{1}{\epsilon \log t}$. They prove, moreover, 
that, in the presence of several global minima, the probabilities of ending in 
any one of them are uniform. 
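To get a feeling for the size of the threshold in Theorem B, the following minimal sketch computes $\bar{\epsilon}$ and the corresponding temperature schedule. The function names and the numbers in the example (1024 states, $\Delta = 9$) are hypothetical and chosen only for illustration.

```python
import math

def epsilon_bar(n_states, delta):
    """Threshold of Theorem B: 1 / (N_Omega * Delta)."""
    return 1.0 / (n_states * delta)

def temperature(eps, t):
    """Temperature in Geman and Geman's sense, 1 / (eps * log t), for t > 1."""
    return 1.0 / (eps * math.log(t))

# Hypothetical example: 2**10 states and Delta = 9
example_threshold = epsilon_bar(2 ** 10, 9)
```

Because the number of states enters the denominator, the threshold shrinks quickly as the state space grows, which is why a heuristic value above it may be preferred in finite time.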

B Proof of Lemmas 

Consider a finite network and call $x_i \in \{0, 1\}$ the action of node $i$, 
so that $x$ is the vector of the actions of all the nodes. Call $N_i^1$ the 
set of nodes which are first neighbors of node $i$, and $N_i^2$ those which 
are second neighbors of node $i$. 

We also need the following definitions. A set of nodes in a network is 
an independent set if, for every link of the network, not both its nodes are 
in the set. A set $C$ of nodes in a network is a covering if, for every node 
$i$, $C \cap (\{i\} \cup N_i^1) \neq \emptyset$ (i.e. if for any node $i$ we 
consider the set made of $i$ itself and its first neighbors, then at least one 
of them is also a member of $C$). A set of nodes in a network is a maximal 
independent set if it is both an independent set and a covering. In our 
notation a maximal independent set is characterized by those nodes playing 1 
in a NE $x$. 
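The three definitions translate directly into checks on an adjacency structure. The sketch below is illustrative only: the function names and the representation of the network as a dictionary of neighbor sets are assumptions, and the example network is a star with one center (node 0) and three spokes.

```python
def is_independent(adj, ones):
    """No link of the network has both of its endpoints in `ones`."""
    return all(not (i in ones and j in ones) for i in adj for j in adj[i])

def is_covering(adj, ones):
    """Every node is in `ones` or has a first neighbor in `ones`."""
    return all(ones & ({i} | adj[i]) for i in adj)

def is_maximal_independent(adj, ones):
    """Both conditions together: `ones` is the set of 1-players of a NE."""
    return is_independent(adj, ones) and is_covering(adj, ones)

# Star: node 0 is the center, nodes 1, 2, 3 are the spokes
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

On the star, both `{0}` and `{1, 2, 3}` pass `is_maximal_independent`, matching the two NE of that network, while `{1, 2}` is independent but fails to cover node 3.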

Finally, remember that we have defined the basic transition step $B$ of best 
response as a Markov process across all states $x$, where an unsatisfied node 
(if existing) is picked with uniform probability, and her action is flipped. 
If there are no unsatisfied nodes, $x$ is a NE and an absorbing state for $B$. 
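A minimal sketch of one application of the step $B$ follows. The names `best_action`, `unsatisfied` and `step_B`, as well as the dictionary representation of the network and of the state, are assumptions made for illustration.

```python
import random

def best_action(adj, x, i):
    """Best response of node i: play 1 if and only if all neighbors play 0."""
    return 1 if all(x[j] == 0 for j in adj[i]) else 0

def unsatisfied(adj, x):
    """Nodes whose current action differs from their best response."""
    return [i for i in sorted(adj) if x[i] != best_action(adj, x, i)]

def step_B(adj, x, rng):
    """Pick one unsatisfied node uniformly at random and flip her action."""
    candidates = unsatisfied(adj, x)
    if not candidates:
        return dict(x)                # x is a NE: absorbing state
    i = rng.choice(candidates)
    y = dict(x)
    y[i] = 1 - y[i]
    return y

star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

Starting from the state where all nodes of the star play 0, every node is unsatisfied and `step_B` flips exactly one of them to 1; a NE is returned unchanged.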

Proof of Lemmas 1 and 2: suppose that $x_i = 1$ and we flip her action 
so that $x_i^{new} = 0$. Consider now any node $j$ in $N_i^1$; it is clear that 
$x_j = 0$ since $x_i = 1$. The new unsatisfied nodes will be all and only those 
$j \in N_i^1$ such that $x_k = 0$ for any $k \in N_j^1 \setminus \{i\}$.^13 If we 
apply the transition step $B$ to one of them, call her $j$, she will be 
satisfied again and all her neighbors will be, because if $j$ is such that 
$x_j = 0$ and $x_j^{new} = 1$, it is surely the case that any 
$k \in N_j^1 \setminus \{i\}$ was playing $x_k = 0$ and remains at 
$x_k^{new} = 0$. 

It could be the case that two such $j$'s, both neighbors of $i$ and of each 
other, are unsatisfied after $i$'s shift. The fact that one of the two may be 
chosen instead of the other in an iteration of $B$ is the only random part in 
the best response rule. 

As the neighbors of $i$ are finite, $B$ needs to be iterated at most $|N_i^1|$ 
times and the propagation of best response is limited to $N_i^1$. 

13 If this set is empty, then the only unsatisfied node is $i$, but as we will 
prove below it cannot be the case if we start by applying $F$ to a node who 
was originally playing 0. 

Note: a best response from 0 to 1 applies only to nodes that are playing 
0 and are linked to a node which is shifting from 1 to 0, where that node is 
the only neighbor they have who is originally playing 1. 

Suppose now that a node $i$ with $x_i = 0$ is chosen by the stochastic 
transition $F$, and we flip her action so that $x_i^{new} = 1$. The nodes $j$ 
in $N_i^1$ who were playing $x_j = 0$ will continue to do so, as they will 
remain satisfied and $B$ will not apply to them. Any node $j$ in $N_i^1$ (at 
least one) who was playing $x_j = 1$ may be selected by $B$ and will then move 
to $x_j^{new} = 0$. Note that there is no indeterminacy in how they will be 
selected by $B$, as they cannot be neighbors of each other, since they were 
all playing 1 in a NE. 

By the previous point this will create a propagation, through $B$, to some 
$k \in N_j^1$, but not to $i$, who is now satisfied, and not even to any other 
$k \in N_j^1 \cap N_i^1$, for the same reason. This proves that the propagation 
of the best response $B$ is limited to $N_i^2$, and that it ends in a new NE 
in a number of steps that is at most $|N_i^1 \cup N_i^2|$. □ 
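The second part of the proof can be illustrated with a short simulation: flip one 0-player to 1 and iterate best responses until a new NE is reached. This is a sketch under an explicit assumption, namely that the perturbed node $i$ keeps her new action while $B$ is applied to the other nodes; the names `unsatisfied` and `perturb_and_relax` are hypothetical.

```python
import random

def unsatisfied(adj, x):
    """Nodes whose action is not a best response to their neighbors."""
    return [i for i in sorted(adj)
            if x[i] != int(all(x[j] == 0 for j in adj[i]))]

def perturb_and_relax(adj, x, i, rng):
    """Flip node i from 0 to 1, then iterate B on the other nodes."""
    assert not unsatisfied(adj, x) and x[i] == 0, "start from a NE with x_i = 0"
    y = dict(x)
    y[i] = 1
    steps = 0
    while True:
        # Assumption: i holds her new action while the others respond
        candidates = [j for j in unsatisfied(adj, y) if j != i]
        if not candidates:
            return y, steps
        j = rng.choice(candidates)
        y[j] = 1 - y[j]
        steps += 1

star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On the star, flipping the center from the NE in which all spokes play 1 leads to the NE in which the center alone plays 1. On the four-node path, flipping node 1 from the NE (1, 0, 1, 0) also moves node 3, a second neighbor, which shows that the propagation can reach $N_i^2$ but stops there.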

Proof of Lemma 3: we proceed by defining intermediate NE $x^1, x^2, \dots$ 
between any two NE $x$ and $x'$. Each $x^{n+1}$ will be obtained from $x^n$ by 
flipping one node from 0 to 1 (through $F$) and waiting for the best response 
(the iteration of $B$, which has been proved above to be finite). 

If two NE $x$ and $x'$ are different, it must be that there is at least one 
$i_1$ such that $x_{i_1} = 0$ and $x'_{i_1} = 1$ (it is easy to check that any 
strict subset of a maximal independent set is not a covering any more). Change 
the action of that node so that $x^1_{i_1} = x'_{i_1} = 1$. By the previous 
proof this will propagate deterministically to $N^1_{i_1}$ and, for all 
$j \in N^1_{i_1}$, we will have $x^1_j = x'_j = 0$. Propagation may also affect 
$N^2_{i_1}$ but this is of no importance for our purposes. 

If still $x^1 \neq x'$, then take another node $i_2$ such that 
$x^1_{i_2} = 0$ and $x'_{i_2} = 1$ ($i_2$ is clearly not a member of 
$N^1_{i_1} \cup \{i_1\}$). Set $x^2_{i_2} = x'_{i_2} = 1$; this will change 
some other nodes by best response, but not $j \in N^1_{i_1} \cup \{i_1\}$, 
because any $j \in N^1_{i_1}$ can rely on $x^1_{i_1} = 1$, and then also 
$x^2_{i_1} = x'_{i_1} = 1$ is fixed. 

We can go on as long as $x^n \neq x'$, taking any node $i_{n+1}$ for which 
$x^n_{i_{n+1}} = 0$ and $x'_{i_{n+1}} = 1$. This process will converge to 
$x^n \to x'$ in a finite number of steps because: 

• when $i_{n+1}$ shifts from 0 to 1, the nodes 
$j \in \bigcup_{h=1}^{n} (N^1_{i_h} \cup \{i_h\})$ will not change, since they 
are either 0-players with a 1-player beside already (the 1-player is some 
$i_h$, with $h \le n$), or a 1 (some $i_h$) surrounded by frozen 0's; 

• by construction it is never the case that 
$i_{n+1} \in \bigcup_{h=1}^{n} (N^1_{i_h} \cup \{i_h\})$, because for all 
$j \in \bigcup_{h=1}^{n} (N^1_{i_h} \cup \{i_h\})$ we have that 
$x^n_j = x'_j$; 

• the network is finite. □ 



In the above proof, the shift from $x$ to $x'$ is done by construction, 
redefining the covering of any $x^n$ from the covering of $x'$. It is always 
certain that, by best response, any $x^n$ is also an independent set. 
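The construction used in the proof of Lemma 3 can be sketched as follows: as long as the current NE differs from the target, pick a node that plays 0 now and 1 in the target, flip her, and let best response settle. All names are hypothetical; as a simplifying assumption, the flipped node is held fixed while the others respond, and the node is chosen by smallest label although the proof allows any choice.

```python
import random

def unsatisfied(adj, x):
    """Nodes whose action is not a best response to their neighbors."""
    return [i for i in sorted(adj)
            if x[i] != int(all(x[j] == 0 for j in adj[i]))]

def relax(adj, y, held, rng):
    """Iterate B on all nodes except `held` until nobody wants to move."""
    y = dict(y)
    while True:
        candidates = [j for j in unsatisfied(adj, y) if j != held]
        if not candidates:
            return y
        j = rng.choice(candidates)
        y[j] = 1 - y[j]

def path_between(adj, x, x_target, rng):
    """Sequence of NE x, x^1, x^2, ... ending in x_target (both must be NE)."""
    current = dict(x)
    sequence = [dict(current)]
    while current != x_target:
        # Any node playing 0 now and 1 in the target; smallest label for brevity
        i = min(k for k in adj if current[k] == 0 and x_target[k] == 1)
        current[i] = 1
        current = relax(adj, current, i, rng)
        sequence.append(dict(current))
    return sequence

line = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On the five-node line, going from the NE (1, 0, 1, 0, 1) to the NE (0, 1, 0, 1, 0) takes two flips, through the intermediate NE (0, 1, 0, 0, 1).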